NSF PAR Search | NSF Public Access Repository

SYNAPSE: SYmbolic Neural-Aided Preference Synthesis Engine

https://doi.org/10.1609/aaai.v39i26.34965

Modak, Sadanand; Patton, Noah Tobias; Dillig, Isil; Biswas, Joydeep (April 2025, Proceedings of the AAAI Conference on Artificial Intelligence)

This paper addresses the problem of preference learning, which aims to align robot behaviors through learning user-specific preferences (e.g. “good pull-over location”) from visual demonstrations. Despite its similarity to learning factual concepts (e.g. “red door”), preference learning is a fundamentally harder problem due to its subjective nature and the paucity of person-specific training data. We address this problem using a novel framework called SYNAPSE, which is a neuro-symbolic approach designed to efficiently learn preferential concepts from limited data. SYNAPSE represents preferences as neuro-symbolic programs – facilitating inspection of individual parts for alignment – in a domain-specific language (DSL) that operates over images and leverages a novel combination of visual parsing, large language models, and program synthesis to learn programs representing individual preferences. We perform extensive evaluations on various preferential concepts as well as user case studies demonstrating its ability to align well with dissimilar user preferences. Our method significantly outperforms baselines, especially when it comes to out-of-distribution generalization. We show the importance of the design choices in the framework through multiple ablation studies.
Free, publicly-accessible full text available April 11, 2026
Copper and Wire: Bridging Expressiveness and Performance for Service Mesh Policies

https://doi.org/10.1145/3669940.3707257

Saxena, Divyanshu; Zhang, William; Pailoor, Shankara; Dillig, Isil; Akella, Aditya (March 2025, ACM)

Free, publicly-accessible full text available March 30, 2026
SYNAPSE: SYmbolic Neural-Aided Preference Synthesis Engine

Modak, Sadanand; Patton, Noah Tobias; Dillig, Isil; Biswas, Joydeep (January 2025, AAAI 25)

This paper addresses the problem of preference learning, which aims to align robot behaviors through learning user-specific preferences (e.g. “good pull-over location”) from visual demonstrations. Despite its similarity to learning factual concepts (e.g. “red door”), preference learning is a fundamentally harder problem due to its subjective nature and the paucity of person-specific training data. We address this problem using a novel framework called SYNAPSE, which is a neuro-symbolic approach designed to efficiently learn preferential concepts from limited data. SYNAPSE represents preferences as neuro-symbolic programs – facilitating inspection of individual parts for alignment – in a domain-specific language (DSL) that operates over images and leverages a novel combination of visual parsing, large language models, and program synthesis to learn programs representing individual preferences. We perform extensive evaluations on various preferential concepts as well as user case studies demonstrating its ability to align well with dissimilar user preferences. Our method significantly outperforms baselines, especially when it comes to out-of-distribution generalization. We show the importance of the design choices in the framework through multiple ablation studies.
Full Text Available
Dynamic Model Predictive Shielding for Provably Safe Reinforcement Learning

Banerjee, Arko; Rahmani, Kia; Biswas, Joydeep; Dillig, Isil (December 2024, NeurIPS 2024)

Among approaches for provably safe reinforcement learning, Model Predictive Shielding (MPS) has proven effective at complex tasks in continuous, high-dimensional state spaces, by leveraging a backup policy to ensure safety when the learned policy attempts to take risky actions. However, while MPS can ensure safety both during and after training, it often hinders task progress due to the conservative and task-oblivious nature of backup policies. This paper introduces Dynamic Model Predictive Shielding (DMPS), which optimizes reinforcement learning objectives while maintaining provable safety. DMPS employs a local planner to dynamically select safe recovery actions that maximize both short-term progress and long-term rewards. Crucially, the planner and the neural policy play a synergistic role in DMPS. When planning recovery actions for ensuring safety, the planner utilizes the neural policy to estimate long-term rewards, allowing it to observe beyond its short-term planning horizon. Conversely, the neural policy under training learns from the recovery plans proposed by the planner, converging to policies that are both high-performing and safe in practice. This approach guarantees safety during and after training, with bounded recovery regret that decreases exponentially with planning horizon depth. Experimental results demonstrate that DMPS converges to policies that rarely require shield interventions after training and achieve higher rewards compared to several state-of-the-art baselines.
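To make the shielding idea in this abstract concrete, here is a minimal Python sketch of the basic Model Predictive Shielding check it builds on, not of DMPS itself. Everything in it is an illustrative assumption rather than the authors' setup: the one-dimensional cart, the wall at position 10, the braking backup policy, and all function names. The shield lets a proposed action through only if the state it leads to can still be brought to a safe stop by the backup policy; otherwise it substitutes the backup action. DMPS, as described above, would replace the fixed backup action with one chosen by a local planner.

```python
from typing import Callable, List, Tuple

State = Tuple[float, float]        # (position, velocity); toy model, assumed
Policy = Callable[[State], float]  # maps a state to an acceleration

WALL = 10.0  # hypothetical unsafe region: position >= WALL


def step(state: State, accel: float) -> State:
    """Toy dynamics with a unit time step: update velocity, then position."""
    x, v = state
    v = v + accel
    return (x + v, v)


def is_safe(state: State) -> bool:
    return state[0] < WALL


def backup_policy(state: State) -> float:
    """Task-oblivious backup: drive velocity to zero, acceleration in [-1, 1]."""
    _, v = state
    return max(-1.0, min(1.0, -v))


def is_recoverable(state: State, horizon: int = 50) -> bool:
    """True if following the backup policy from `state` stays safe until stopped."""
    s = state
    for _ in range(horizon):
        if not is_safe(s):
            return False
        if s[1] == 0:
            return True  # stopped inside the safe region
        s = step(s, backup_policy(s))
    return False  # conservative: no proof of safety within the horizon


def shielded_action(state: State, policy: Policy) -> float:
    """Allow the proposed action only if its successor state is recoverable."""
    proposed = policy(state)
    if is_recoverable(step(state, proposed)):
        return proposed
    return backup_policy(state)


def aggressive_policy(state: State) -> float:
    """Stand-in for a learned policy that always accelerates toward the wall."""
    return 1.0


def run(policy: Policy, steps: int, use_shield: bool,
        start: State = (0.0, 0.0)) -> List[State]:
    """Roll out `policy` for `steps` steps and return the visited states."""
    state = start
    trajectory = [state]
    for _ in range(steps):
        action = shielded_action(state, policy) if use_shield else policy(state)
        state = step(state, action)
        trajectory.append(state)
    return trajectory
```

In this toy setting, `run(aggressive_policy, 30, use_shield=True)` never reaches the wall, while the unshielded run crosses it within a few steps. The shielded cart also simply parks short of the wall, which illustrates the conservative, task-oblivious behavior the abstract attributes to plain MPS.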
Full Text Available
Dynamic Model Predictive Shielding for Provably Safe Reinforcement Learning

Banerjee, Arko; Rahmani, Kia; Biswas, Joydeep; Dillig, Isil (December 2024, Advances in Neural Information Processing Systems (NeurIPS))

Full Text Available
Programmatic Imitation Learning From Unlabeled and Noisy Demonstrations

https://doi.org/10.1109/LRA.2024.3385691

Xin, Jimmy; Zheng, Linus; Rahmani, Kia; Wei, Jiayi; Holtz, Jarrett; Dillig, Isil; Biswas, Joydeep (June 2024, IEEE Robotics and Automation Letters)

Full Text Available
SatLM: Satisfiability-Aided Language Models Using Declarative Prompting

Ye, Xi; Chen, Qiaochu; Dillig, Isil; Durrett, Greg (December 2023, Advances in Neural Information Processing Systems)

Prior work has combined chain-of-thought prompting in large language models (LLMs) with programmatic representations to perform effective and transparent reasoning. While such an approach works well for tasks that only require forward reasoning (e.g., straightforward arithmetic), it is less effective for constraint solving problems that require more sophisticated planning and search. In this paper, we propose a new satisfiability-aided language modeling (SatLM) approach for improving the reasoning capabilities of LLMs. We use an LLM to generate a declarative task specification rather than an imperative program and leverage an off-the-shelf automated theorem prover to derive the final answer. This approach has two key advantages. The declarative specification is closer to the problem description than the reasoning steps are, so the LLM can parse it out of the description more accurately. Furthermore, by offloading the actual reasoning task to an automated theorem prover, our approach can guarantee the correctness of the answer with respect to the parsed specification and avoid planning errors in the solving process. We evaluate SatLM on 8 different datasets and show that it consistently outperforms program-aided LMs in the imperative paradigm. In particular, SatLM outperforms program-aided LMs by 23% on a challenging subset of the GSM arithmetic reasoning dataset; SatLM also achieves a new SoTA on LSAT and BoardgameQA, surpassing previous models that are trained on the respective training sets.
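The split between a declarative specification and a separate solver can be illustrated with a small Python sketch. It is a toy under stated assumptions, not the SatLM pipeline: the word problem, the variable names, and the hand-written constraints stand in for what an LLM would be asked to produce, and a brute-force search over a small integer domain stands in for the off-the-shelf automated theorem prover. The specification only states what must hold; it contains no solving steps.

```python
from itertools import product
from typing import Callable, Dict, List, Sequence

Assignment = Dict[str, int]
Constraint = Callable[[Assignment], bool]


def solve(variables: Sequence[str], domain: range,
          constraints: List[Constraint]) -> List[Assignment]:
    """Return every assignment over `domain` that satisfies all constraints.

    Exhaustive search is a stand-in for a real solver; it is only viable
    for tiny domains like the one used here.
    """
    solutions = []
    for values in product(domain, repeat=len(variables)):
        candidate = dict(zip(variables, values))
        if all(check(candidate) for check in constraints):
            solutions.append(candidate)
    return solutions


# Hypothetical problem: "Alice has 3 more apples than Bob.
# Together they have 11 apples. How many does Alice have?"
# Declarative specification (hand-written here; assumed to be the kind of
# output an LLM would generate). Each line restates one sentence of the problem.
spec: List[Constraint] = [
    lambda m: m["alice"] == m["bob"] + 3,
    lambda m: m["alice"] + m["bob"] == 11,
]

answers = solve(["alice", "bob"], range(0, 21), spec)
```

Here `answers` contains the single assignment with `alice` equal to 7 and `bob` equal to 4. Because the solver checks every constraint, any answer it returns is correct with respect to the specification, which mirrors the guarantee described in the abstract; errors can still enter if the specification misreads the problem.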
Full Text Available
Guiding Safe Exploration with Weakest Preconditions

Anderson, Greg; Chaudhuri, Swarat; Dillig, Isil (January 2023, International Conference on Learning Representations (ICLR))

In reinforcement learning for safety-critical settings, it is often desirable for the agent to obey safety constraints at all points in time, including during training. We present a novel neurosymbolic approach called SPICE to solve this safe exploration problem. SPICE uses an online shielding layer based on symbolic weakest preconditions to achieve a more precise safety analysis than existing tools without unduly impacting the training process. We evaluate the approach on a suite of continuous control benchmarks and show that it can achieve comparable performance to existing safe learning techniques while incurring fewer safety violations. Additionally, we present theoretical results showing that SPICE converges to the optimal safe policy under reasonable assumptions.
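As a rough illustration of shielding with a weakest precondition, the Python sketch below uses a one-dimensional linear model. The model form `next_x = a * x + b * u`, the safety property `next_x <= limit`, the requirement `b > 0`, and all function names are assumptions made for the example; they are not taken from SPICE. The weakest precondition of the safety property with respect to the action `u` is the constraint `u <= (limit - a * x) / b`, and the shield replaces an unsafe proposed action with the closest action that satisfies it.

```python
def weakest_precondition_bound(x: float, a: float, b: float, limit: float) -> float:
    """Largest action u such that a * x + b * u <= limit, assuming b > 0."""
    if b <= 0:
        raise ValueError("this sketch assumes b > 0")
    return (limit - a * x) / b


def shield(x: float, proposed_u: float, a: float, b: float, limit: float) -> float:
    """Return the proposed action if it meets the precondition, else clip it."""
    return min(proposed_u, weakest_precondition_bound(x, a, b, limit))


def next_state(x: float, u: float, a: float, b: float) -> float:
    """Assumed linear model of the environment."""
    return a * x + b * u
```

With `x = 4`, `a = 1`, `b = 2`, and `limit = 10`, the bound on the action is 3, so a proposed action of 5 is clipped to 3 and a proposed action of 1 passes through unchanged. A shield of this kind is only as precise as the model it analyzes, which is why the abstract's emphasis on a more precise safety analysis matters.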
Full Text Available
Chipmunk: Investigating Crash-Consistency in Persistent-Memory File Systems

https://doi.org/10.1145/3552326.3567498

LeBlanc, Hayley; Pailoor, Shankara; K R E, Om Saran; Dillig, Isil; Bornholt, James; Chidambaram, Vijay (May 2023, ACM)

We present Chipmunk, a new framework to test persistent-memory (PM) file systems for crash-consistency bugs. Using Chipmunk, we discovered 23 new bugs across five PM file systems; most bugs have been confirmed and fixed by developers. The discovered bugs have serious consequences, including making the file system un-mountable or breaking rename atomicity. We present a detailed study of the bugs found using Chipmunk and discuss important lessons learned for designing and testing PM file systems.
Full Text Available
STEADY: Simultaneous State Estimation and Dynamics Learning from Indirect Observations

https://doi.org/10.1109/IROS47612.2022.9981279

Wei, Jiayi; Holtz, Jarrett; Dillig, Isil; Biswas, Joydeep (October 2022, Intelligent Robots and Systems (IROS), IEEE/RSJ International Conference on)

Full Text Available
